Exploratory Mining and Pruning Optimizations of Constrained Associations Rules
Abstract
From the standpoint of supporting human-centered discovery of knowledge, the present-day model of mining association rules suffers from the following serious shortcomings: (i) lack of user exploration and control, (ii) lack of focus, and (iii) rigid notion of relationships. In effect, this model functions as a black box, admitting little user interaction in between. We propose, in this paper, an architecture that opens up the black box and supports constraint-based, human-centered exploratory mining of associations. The foundation of this architecture is a rich set of constraint constructs, including domain, class, and SQL-style aggregate constraints, which enable users to clearly specify what associations are to be mined. We propose constrained association queries as a means of specifying the constraints to be satisfied by the antecedent and consequent of a mined association. In this paper, we mainly focus on the technical challenges in guaranteeing a level of performance that is commensurate with the selectivities of the constraints in an association query. To this end, we introduce and analyze two properties of constraints that are critical to pruning: anti-monotonicity and succinctness. We then develop characterizations of various constraints into four categories, according to these properties. Finally, we describe a mining algorithm called CAP, which achieves a maximized degree of pruning for all categories of constraints. Experimental results indicate that CAP can run much faster than several basic algorithms, in some cases as much as 80 times faster. This demonstrates how important the succinctness and anti-monotonicity properties are in delivering the performance guarantee.
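To make the role of anti-monotonicity in pruning concrete, here is a minimal sketch of level-wise (Apriori-style) frequent itemset mining with a constraint check pushed into candidate generation. It is not the CAP algorithm from the paper, and it does not cover succinctness. The function name `frequent_itemsets`, the sample `PRICES` table, the transactions, and the budget constraint are all hypothetical and chosen only for illustration. The sketch assumes the constraint is anti-monotone: once an itemset violates it, every superset violates it too, so the itemset can be dropped before any of its supersets are counted.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support, constraint=None):
    """Level-wise mining with an optional anti-monotone constraint.

    `constraint` maps a frozenset to True/False. It is assumed to be
    anti-monotone (a violating set has only violating supersets).
    """
    if constraint is None:
        constraint = lambda itemset: True

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    level = [s for s in level if constraint(s) and support(s) >= min_support]

    result = {}
    k = 1
    while level:
        for s in level:
            result[s] = support(s)
        survivors = set(level)
        k += 1
        # Join step: unions of surviving sets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must have survived the previous level,
        # i.e. be both frequent and constraint-satisfying.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in survivors for sub in combinations(c, k - 1))
        }
        level = [c for c in candidates
                 if constraint(c) and support(c) >= min_support]
    return result

# Hypothetical data: item prices and four market-basket transactions.
PRICES = {"pen": 1, "ink": 2, "book": 10, "lamp": 25}
TRANSACTIONS = [
    {"pen", "ink", "book"},
    {"pen", "ink"},
    {"pen", "book", "lamp"},
    {"ink", "book"},
]

# sum(price) <= 11 is anti-monotone because all prices are non-negative.
within_budget = lambda s: sum(PRICES[i] for i in s) <= 11

unconstrained = frequent_itemsets(TRANSACTIONS, min_support=2)
constrained = frequent_itemsets(TRANSACTIONS, min_support=2,
                                constraint=within_budget)
```

In this toy run, `{ink, book}` is frequent but exceeds the budget, so it is dropped at level 2 and the candidate `{pen, ink, book}` is eliminated by the subset check without its support ever being counted. Checking the constraint only on the final output would give the same answer here but would count more candidates along the way.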
Similar resources
Exploratory Mining and Pruning Optimizations of Constrained Associations Rules
From the standpoint of supporting human-centered discovery of knowledge, the present-day model of mining association rules suffers from the following serious shortcomings: (i) lack of user exploration and control, (ii) lack of focus, and (iii) rigid notion of relationships. In effect, this model functions as a black-box, admitting little user interaction in between. We propose, in this paper, an ...
Association Rule Mining with Apriori and Fpgrowth Using Weka
Association rule mining is considered a major technique in data mining applications. It reveals all interesting relationships, called associations, in a potentially large database. However, how interesting a rule is depends on the problem a user wants to solve. Existing approaches employ different parameters to guide the search for interesting rules. Class association rules which combine ass...
On pruning strategies for discovery of generalized and quantitative association rules
Mining association rules has become an important data mining task, and many algorithms have been developed that differ in several aspects. In this paper, we analyse and compare the pruning strategies of several algorithms that were designed for mining generalised and quantitative association rules while abstracting from other technical details. Furthermore, we sketch a novel pru...
Indexed Enhancement on GenMax Algorithm for Fast and Less Memory Utilized Pruning of MFI and CFI
Mining frequent item sets is the essential problem in many data mining applications, such as the discovery of association rules, patterns, and many other important discovery tasks. Fast methods with low memory utilization for mining frequent item sets are highly desirable in transactional databases. Methods for mining frequent item sets have been implemented using a prefix-tree structure...